Summary. This article introduces an observable Markov model (OMM) that is
equivalent to a hidden Markov model (HMM). The model contains no hidden part
and has the same major properties as the respective HMM. OMMs also lead to
minor and obvious improvements in the major algorithms.

1 Introduction 

We will use the following notions.

Definition 1. An oriented graph $G$ is the pair $(EG, VG)$, where $EG \subset VG \times VG$.
$VG$ is called the set of vertices and $EG$ is called the set of edges. For any edge $x = (\mu, \nu)$
denote $\mu = \mathrm{dom}(x)$, $\nu = \mathrm{cod}(x)$.

Definition 2. I = [0,1]. 

Definition 3. A probability space is the triple $(\Omega, \mathcal{F}, P)$, where $\mathcal{F} \subset 2^{\Omega}$ is a
$\sigma$-algebra of subsets of $\Omega$ and $P : \mathcal{F} \to \mathbb{R}$ is a probability measure on $\mathcal{F}$ ([2]).

Definition 4. If $\Omega$ is either $\mathbb{R}$ or countable then $P$ is called a probability distribution
on $\Omega$ ([3]). If $\Omega = \mathbb{R}$ then $\mathcal{F}$ is the Borel algebra on $\mathbb{R}$, which is unique; if $\Omega$ is
countable then $\mathcal{F} = 2^{\Omega}$. Therefore the definition of $\Omega$ in these cases uniquely
determines $\mathcal{F}$, and we will say that $P$ is a probability distribution on $\Omega$.

Definition 5. Let $\mathcal{A}$ be a graph, called the primary graph, and $B$ be a set of letters,
called the alphabet or observables, and let $\forall \mu \in V\mathcal{A}$ there be probability distributions
$b_\mu$ on $B$ and $a_\mu$ on $\{(\mu, \nu) \in E\mathcal{A}\}$. $b_\mu$ is called the distribution of observables for the
vertex $\mu$ and $a_\mu$ is called the transition distribution from the vertex $\mu$. These two
sets of distributions could be interpreted as functions $b : V\mathcal{A} \times B \to I$ and
$a : E\mathcal{A} \to I$ respectively. The function $\iota : V\mathcal{A} \to \mathbb{R}$, a probability distribution on $V\mathcal{A}$,
is called the initial state distribution. Following [1], under HMM we will understand
$\lambda = (a, b, \iota)$.


In voice recognition systems it is usually assumed that for the primary graph
$V\mathcal{A} = \{\nu_0, \nu_1, \nu_2, \nu_3, \nu_4\}$ and
$E\mathcal{A} = \{(\nu_0, \nu_1), (\nu_1, \nu_1), (\nu_1, \nu_2), (\nu_2, \nu_2), (\nu_2, \nu_3), (\nu_3, \nu_3), (\nu_3, \nu_4)\}$,

i.e. $\mathcal{A}$ is a "chain" with loops on internal vertices and

$$\iota(\nu) = \begin{cases} 1, & \nu = \nu_0 \\ 0, & \nu \neq \nu_0 \end{cases}$$
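
As an illustration, the chain-shaped primary graph above can be built programmatically. The following is a minimal Python sketch; the function name `chain_graph` and the use of the integers 0, ..., n-1 for the vertices $\nu_0, \dots, \nu_{n-1}$ are assumptions made for this example only.

```python
def chain_graph(n):
    """Build a chain of n vertices with loops on the internal vertices.

    Vertices are the integers 0..n-1 (a hypothetical encoding of nu_0..nu_{n-1}).
    Returns (vertices, edges, iota), where iota puts all mass on vertex 0.
    """
    vertices = list(range(n))
    edges = [(i, i + 1) for i in range(n - 1)]   # chain edges
    edges += [(i, i) for i in range(1, n - 1)]   # loops on internal vertices
    iota = {v: (1.0 if v == 0 else 0.0) for v in vertices}
    return vertices, sorted(edges), iota


def dom(edge):
    """Source vertex of an edge (mu, nu)."""
    return edge[0]


def cod(edge):
    """Target vertex of an edge (mu, nu)."""
    return edge[1]


vertices, edges, iota = chain_graph(5)
```

With `n = 5` this yields the seven edges listed above; the last vertex has no outgoing edge, exactly as in the chain described in the text.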

1.1 Hidden Markov Models (HMM), basic algorithms 

There are 3 problems of HMM, but we will be interested in #1 and #3 only.
Let $\mathbb{T} = \{0, \dots, T-1\}$ and $\mathbb{T}-1 = \{0, \dots, T-2\}$, $O_T = \{o_t \in B \mid t \in \mathbb{T}\}$ - observation
sequence of letters of the alphabet $B$, $Q_T = \{q_t \in V\mathcal{A} \mid t \in \mathbb{T}\}$ - sequence
of the states.


Problem #1. Compute $P(O_T \mid \lambda)$, the probability that the given model $\lambda$ generates
the given sequence $O_T$.

Problem #3. Adjust the model $\lambda$ to maximize $P(O_T \mid \lambda)$ for the given $O_T$.


Algorithms. For the solution of problem #1 the forward and backward algorithms
could be used; each requires $O(|V\mathcal{A}|^2 \cdot T)$ operations.


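Forward algorithm. Denote the forward variable $\alpha_t(\mu) = P(o_0, \dots, o_t, q_t = \mu \mid \lambda)$,
i.e. the probability of observing the sequence $\{o_0, \dots, o_t\}$ and of being
in the state $\mu$ at the moment $t$.

Solution. 1) $\forall \mu \in V\mathcal{A}$: $\alpha_0(\mu) = \iota(\mu) \cdot b_\mu(o_0)$

2) $\forall t \in \mathbb{T}-1$, $\forall \mu \in V\mathcal{A}$:

$$\alpha_{t+1}(\mu) = \Big( \sum_{\nu \in V\mathcal{A} \wedge (\nu,\mu) \in E\mathcal{A}} \alpha_t(\nu) \cdot a(\nu,\mu) \Big) \cdot b_\mu(o_{t+1})$$

3) $P(O_T \mid \lambda) = \sum_{\mu \in V\mathcal{A}} \alpha_{T-1}(\mu)$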
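
The three steps of the forward algorithm can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the dictionary layout (`a` keyed by edges, `b` keyed by vertex and letter, `iota` keyed by vertex) and the two-state toy model are hypothetical choices made for the example.

```python
def forward(a, b, iota, obs):
    """Return P(obs | model) for an HMM stored in dictionaries.

    a[(nu, mu)] : transition probability over the edge (nu, mu)
    b[mu][c]    : probability of observing the letter c in the vertex mu
    iota[mu]    : initial state probability
    """
    vertices = list(iota)
    # Step 1: alpha_0(mu) = iota(mu) * b_mu(o_0)
    alpha = {mu: iota[mu] * b[mu][obs[0]] for mu in vertices}
    # Step 2: accumulate over the existing edges only, then weight by b_mu(o_{t+1})
    for o in obs[1:]:
        incoming = {mu: 0.0 for mu in vertices}
        for (nu, mu), p in a.items():
            incoming[mu] += alpha[nu] * p
        alpha = {mu: incoming[mu] * b[mu][o] for mu in vertices}
    # Step 3: P(O_T | lambda) is the sum of alpha_{T-1}(mu) over all vertices
    return sum(alpha.values())


# Hypothetical two-state model over the alphabet {"x", "y"}.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 1.0, "s1": 0.0}

likelihood = forward(a, b, iota, ["x", "y"])
```

Iterating over the edge dictionary rather than over all pairs of vertices means each time step costs $O(|E\mathcal{A}|)$ operations, which is at most $|V\mathcal{A}|^2$.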


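Backward algorithm. Denote the backward variable $\beta_t(\mu) = P(o_{t+1}, \dots, o_{T-1} \mid q_t = \mu, \lambda)$,
i.e. the probability of observing the sequence $\{o_{t+1}, \dots, o_{T-1}\}$ given that
the state at the moment $t$ is $\mu$.

Solution. 1) $\forall \mu \in V\mathcal{A}$: $\beta_{T-1}(\mu) = 1$

2) $\forall t \in \mathbb{T}-1$, $\forall \mu \in V\mathcal{A}$:

$$\beta_t(\mu) = \sum_{\nu \in V\mathcal{A} \wedge (\mu,\nu) \in E\mathcal{A}} \beta_{t+1}(\nu) \cdot a(\mu,\nu) \cdot b_\nu(o_{t+1})$$

3) $P(O_T \mid \lambda) = \sum_{\mu \in V\mathcal{A}} \iota(\mu) \cdot b_\mu(o_0) \cdot \beta_0(\mu)$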
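
The backward recursion admits an equally short sketch. As before, the dictionary layout and the two-state toy model are assumptions made for illustration; only the three steps themselves come from the text.

```python
def backward(a, b, iota, obs):
    """Return P(obs | model) computed with the backward variable beta.

    a[(mu, nu)] : transition probability over the edge (mu, nu)
    b[mu][c]    : probability of observing the letter c in the vertex mu
    iota[mu]    : initial state probability
    """
    vertices = list(iota)
    # Step 1: beta_{T-1}(mu) = 1
    beta = {mu: 1.0 for mu in vertices}
    # Step 2: walk t = T-2, ..., 0; the letter used at step t is o_{t+1}
    for o in reversed(obs[1:]):
        new_beta = {mu: 0.0 for mu in vertices}
        for (mu, nu), p in a.items():
            new_beta[mu] += beta[nu] * p * b[nu][o]
        beta = new_beta
    # Step 3: combine beta_0 with the initial distribution and the first letter
    return sum(iota[mu] * b[mu][obs[0]] * beta[mu] for mu in vertices)


# Hypothetical two-state model over the alphabet {"x", "y"}.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 1.0, "s1": 0.0}

likelihood = backward(a, b, iota, ["x", "y"])
```

For this toy model $\beta_0(s_0) = 0.7 \cdot 0.1 + 0.3 \cdot 0.8 = 0.31$, so the result is $1.0 \cdot 0.9 \cdot 0.31 = 0.279$.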





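Solution of the problem #3, Baum-Welch algorithm. Let $\forall t \in \mathbb{T}-1$, $\forall (\mu, \nu) \in E\mathcal{A}$:

$$\xi_t(\mu,\nu) = P((q_t, q_{t+1}) = (\mu,\nu) \mid O_T, \lambda)$$

$$= \frac{\alpha_t(\mu) \cdot a(\mu,\nu) \cdot b_\nu(o_{t+1}) \cdot \beta_{t+1}(\nu)}{P(O_T \mid \lambda)}$$

$$= \frac{\alpha_t(\mu) \cdot a(\mu,\nu) \cdot b_\nu(o_{t+1}) \cdot \beta_{t+1}(\nu)}{\sum_{(\rho,\sigma) \in E\mathcal{A}} \alpha_t(\rho) \cdot a(\rho,\sigma) \cdot b_\sigma(o_{t+1}) \cdot \beta_{t+1}(\sigma)}$$

Let $\gamma_t(\mu) = P(q_t = \mu \mid O_T, \lambda) = \sum_{\nu \in V\mathcal{A} \wedge (\mu,\nu) \in E\mathcal{A}} \xi_t(\mu,\nu)$. Then

$\sum_{t \in \mathbb{T}-1} \gamma_t(\mu)$ - expected number of transitions from the vertex $\mu$,

$\sum_{t \in \mathbb{T}-1} \xi_t(x)$ - expected number of transitions over the edge $x$.

The most difficult part of the solution is the maximization problem based on the
parameters:

$$\bar{\iota}(\mu) = \gamma_0(\mu)$$

$$\bar{a}(\mu,\nu) = \frac{\sum_{t \in \mathbb{T}-1} \xi_t(\mu,\nu)}{\sum_{t \in \mathbb{T}-1} \gamma_t(\mu)}$$

$$\bar{b}_\mu(c) = \frac{\sum_{t \in \mathbb{T}-1 \wedge o_t = c} \gamma_t(\mu)}{\sum_{t \in \mathbb{T}-1} \gamma_t(\mu)}$$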

It could be done in many ways, but exact algorithms are not important for 
us now. 
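
For concreteness, one re-estimation step built from $\xi_t$ and $\gamma_t$ might look like the Python sketch below. It follows the index sets used in this section (all sums run over $t \in \{0, \dots, T-2\}$) and assumes an observation sequence of length at least 2 and a positive expected transition count for every vertex. The function names, the dictionary layout and the toy model are hypothetical.

```python
def forward_table(a, b, iota, obs):
    """List of dictionaries alpha[t][mu] for t = 0..T-1."""
    alpha = [{mu: iota[mu] * b[mu][obs[0]] for mu in iota}]
    for o in obs[1:]:
        incoming = {mu: 0.0 for mu in iota}
        for (nu, mu), p in a.items():
            incoming[mu] += alpha[-1][nu] * p
        alpha.append({mu: incoming[mu] * b[mu][o] for mu in iota})
    return alpha


def backward_table(a, b, iota, obs):
    """List of dictionaries beta[t][mu] for t = 0..T-1."""
    beta = [{mu: 1.0 for mu in iota}]
    for o in reversed(obs[1:]):
        new_beta = {mu: 0.0 for mu in iota}
        for (mu, nu), p in a.items():
            new_beta[mu] += beta[0][nu] * p * b[nu][o]
        beta.insert(0, new_beta)
    return beta


def reestimate(a, b, iota, obs):
    """One re-estimation step; returns (new_a, new_b, new_iota)."""
    alpha = forward_table(a, b, iota, obs)
    beta = backward_table(a, b, iota, obs)
    likelihood = sum(alpha[-1].values())
    steps = range(len(obs) - 1)  # the index set {0, ..., T-2}
    # xi[t][(mu, nu)]: posterior probability of using the edge (mu, nu) at time t
    xi = [
        {(mu, nu): alpha[t][mu] * p * b[nu][obs[t + 1]] * beta[t + 1][nu] / likelihood
         for (mu, nu), p in a.items()}
        for t in steps
    ]
    # gamma[t][mu]: sum of xi[t] over the edges leaving mu
    gamma = [
        {mu: sum(x for (m, _), x in xi[t].items() if m == mu) for mu in iota}
        for t in steps
    ]
    visits = {mu: sum(gamma[t][mu] for t in steps) for mu in iota}
    letters = list(next(iter(b.values())))
    new_iota = dict(gamma[0])
    new_a = {e: sum(xi[t][e] for t in steps) / visits[e[0]] for e in a}
    new_b = {
        mu: {c: sum(gamma[t][mu] for t in steps if obs[t] == c) / visits[mu]
             for c in letters}
        for mu in iota
    }
    return new_a, new_b, new_iota


# Hypothetical two-state model over the alphabet {"x", "y"}.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 0.6, "s1": 0.4}

new_a, new_b, new_iota = reestimate(a, b, iota, ["x", "y", "y", "x"])
```

Each re-estimated distribution sums to 1 by construction, since $\gamma_t(\mu)$ is the sum of $\xi_t$ over the edges leaving $\mu$.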


2 Observable Markov Models (OMM) 

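Definition 6. For a given HMM $\lambda = (a, b, \iota)$ define the model $OMM(\lambda) = (\tilde{a}, \tilde{\iota})$
with the graph $\tilde{\mathcal{A}}$ and functions $\tilde{\iota}$ and $\pi_B$. $\tilde{\mathcal{A}}$ consists of vertices $V\tilde{\mathcal{A}} = V\mathcal{A} \times B$,
called states, and edges

$$E\tilde{\mathcal{A}} = \{(f, c, d) \mid f \in E\mathcal{A} \wedge c, d \in B\} = E\mathcal{A} \times B \times B,$$

$\tilde{a} : E\tilde{\mathcal{A}} \to I$ is defined as $\tilde{a}(f, c, d) = b_{\mathrm{cod}(f)}(d) \cdot a(f)$, $\tilde{\iota} : V\tilde{\mathcal{A}} \to I$ is defined as
$\tilde{\iota}(\mu, c) = \iota(\mu) \cdot b_\mu(c)$ and $\pi_B : V\tilde{\mathcal{A}} \to B$ is the projection.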

Obviously the graph $\tilde{\mathcal{A}}$ is more complex than the primary graph $\mathcal{A}$, but the general model
simplifies, because it now consists of a single probability distribution instead of
two and a single set of vertices instead of two different sets of vertices and
observables.
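
The construction in Definition 6 is mechanical, as the following Python sketch suggests. The function name `build_omm`, the dictionary layout and the two-state toy model are assumptions made for this example. An edge $(f, c, d)$ with $f = (\mu, \nu)$ is stored as the pair of states $((\mu, c), (\nu, d))$ that it joins.

```python
def build_omm(a, b, iota):
    """Return (a_tilde, iota_tilde) of OMM(lambda) for an HMM lambda = (a, b, iota).

    a[(mu, nu)] : transition probability over the edge (mu, nu)
    b[mu][c]    : probability of observing the letter c in the vertex mu
    iota[mu]    : initial state probability
    """
    letters = list(next(iter(b.values())))
    # States are pairs (vertex, letter): iota_tilde(mu, c) = iota(mu) * b_mu(c)
    iota_tilde = {(mu, c): iota[mu] * b[mu][c] for mu in iota for c in letters}
    # a_tilde(f, c, d) = b_{cod(f)}(d) * a(f); note that it does not depend on c
    a_tilde = {
        ((mu, c), (nu, d)): b[nu][d] * p
        for (mu, nu), p in a.items()
        for c in letters
        for d in letters
    }
    return a_tilde, iota_tilde


# Hypothetical two-state model over the alphabet {"x", "y"}.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 1.0, "s1": 0.0}

a_tilde, iota_tilde = build_omm(a, b, iota)
```

The toy model has 2 vertices, 4 edges and 2 letters, so the resulting OMM has $2 \cdot 2 = 4$ states and $4 \cdot 2 \cdot 2 = 16$ edges, which illustrates the growth of the graph noted above.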
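Forward algorithm for OMM. Denote the forward variable $\alpha_t(\mu, o_t) = P(o_0, \dots, o_t, (q_t, o_t) = (\mu, o_t) \mid OMM(\lambda))$,
i.e. the probability of observing $\{o_0, \dots, o_t\}$ and of being in the state
$(\mu, o_t)$ at the moment $t$.

Solution. 1) $\forall (\mu, o_0) \in V\tilde{\mathcal{A}}$: $\alpha_0(\mu, o_0) = \tilde{\iota}(\mu, o_0)$

2) $\forall t \in \mathbb{T}-1$, $\forall (\mu, o_{t+1}) \in V\tilde{\mathcal{A}}$:

$$\alpha_{t+1}(\mu, o_{t+1}) = \sum_{\nu \in V\mathcal{A} \wedge (\nu,\mu) \in E\mathcal{A}} \alpha_t(\nu, o_t) \cdot a(\nu,\mu) \cdot b_\mu(o_{t+1})$$

$$= \sum_{\nu \in V\mathcal{A} \wedge ((\nu,\mu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}} \alpha_t(\nu, o_t) \cdot \tilde{a}((\nu,\mu), o_t, o_{t+1})$$

Writing the edge $((\nu,\mu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}$ as the pair $(y, x)$ with $y = (\nu, o_t)$ and $x = (\mu, o_{t+1})$, it could be rewritten as:

$$\alpha_{t+1}(x) = \sum_{y \in V\tilde{\mathcal{A}} \wedge (y,x) \in E\tilde{\mathcal{A}} \wedge \pi_B(y) = o_t} \alpha_t(y) \cdot \tilde{a}(y, x)$$

3) $P(O_T \mid OMM(\lambda)) = \sum_{\mu \in V\mathcal{A}} \alpha_{T-1}(\mu, o_{T-1}) = \sum_{(\mu, o_{T-1}) \in V\tilde{\mathcal{A}}} \alpha_{T-1}(\mu, o_{T-1})$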
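
A possible Python rendering of the OMM forward recursion is given below. It is a sketch under assumptions: the OMM is built inline from a hypothetical two-state HMM, edges are stored as pairs of states, and the restriction to states that show the observed letter is implemented by setting the forward variable to zero everywhere else.

```python
def omm_forward(a_tilde, iota_tilde, obs):
    """Return P(obs | OMM) for an OMM stored in dictionaries.

    a_tilde[(y, x)] : weight of the edge from state y to state x
    iota_tilde[s]   : initial weight of the state s = (vertex, letter)
    """
    # Step 1: alpha_0(mu, o_0) = iota_tilde(mu, o_0); other states get 0
    alpha = {s: (p if s[1] == obs[0] else 0.0) for s, p in iota_tilde.items()}
    # Step 2: alpha_{t+1}(x) sums alpha_t(y) * a_tilde(y, x) over edges (y, x)
    # whose target x shows the letter o_{t+1}
    for o in obs[1:]:
        new_alpha = {s: 0.0 for s in iota_tilde}
        for (y, x), p in a_tilde.items():
            if x[1] == o:
                new_alpha[x] += alpha[y] * p
        alpha = new_alpha
    # Step 3: only states showing o_{T-1} carry non-zero mass
    return sum(alpha.values())


# Hypothetical two-state HMM over {"x", "y"} and the OMM derived from it.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 1.0, "s1": 0.0}
iota_tilde = {(mu, c): iota[mu] * b[mu][c] for mu in iota for c in "xy"}
a_tilde = {((mu, c), (nu, d)): b[nu][d] * p
           for (mu, nu), p in a.items() for c in "xy" for d in "xy"}

likelihood = omm_forward(a_tilde, iota_tilde, ["x", "y"])
```

The recursion no longer refers to `b` at all: each step multiplies by a single edge weight, which is the simplification the OMM offers.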

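Backward algorithm for OMM. Denote the backward variable $\beta_t(\mu, o_t) = P(o_{t+1}, \dots, o_{T-1} \mid (q_t, o_t) = (\mu, o_t), OMM(\lambda))$,
i.e. the probability of observing $\{o_{t+1}, \dots, o_{T-1}\}$ given that the state
at the moment $t$ is $(\mu, o_t)$.

Solution. 1) $\forall (\mu, o_{T-1}) \in V\tilde{\mathcal{A}}$: $\beta_{T-1}(\mu, o_{T-1}) = 1$

2) $\forall t \in \mathbb{T}-1$, $\forall (\mu, o_t) \in V\tilde{\mathcal{A}}$:

$$\beta_t(\mu, o_t) = \sum_{\nu \in V\mathcal{A} \wedge (\mu,\nu) \in E\mathcal{A}} \beta_{t+1}(\nu, o_{t+1}) \cdot a(\mu,\nu) \cdot b_\nu(o_{t+1})$$

$$= \sum_{\nu \in V\mathcal{A} \wedge ((\mu,\nu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}} \beta_{t+1}(\nu, o_{t+1}) \cdot \tilde{a}((\mu,\nu), o_t, o_{t+1})$$

Writing the edge $((\mu,\nu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}$ as the pair $(x, y)$ with $x = (\mu, o_t)$ and $y = (\nu, o_{t+1})$, it could be rewritten as:

$$\beta_t(x) = \sum_{y \in V\tilde{\mathcal{A}} \wedge (x,y) \in E\tilde{\mathcal{A}} \wedge \pi_B(y) = o_{t+1}} \beta_{t+1}(y) \cdot \tilde{a}(x, y)$$

3) $P(O_T \mid OMM(\lambda)) = \sum_{\mu \in V\mathcal{A}} \tilde{\iota}(\mu, o_0) \cdot \beta_0(\mu, o_0) = \sum_{(\mu, o_0) \in V\tilde{\mathcal{A}}} \tilde{\iota}(\mu, o_0) \cdot \beta_0(\mu, o_0)$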
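
The OMM backward recursion can be sketched in the same style. The inline construction of the OMM from a hypothetical two-state HMM and the storage of edges as pairs of states are assumptions made for the example.

```python
def omm_backward(a_tilde, iota_tilde, obs):
    """Return P(obs | OMM) computed with the backward variable beta.

    a_tilde[(x, y)] : weight of the edge from state x to state y
    iota_tilde[s]   : initial weight of the state s = (vertex, letter)
    """
    # Step 1: beta_{T-1}(x) = 1
    beta = {s: 1.0 for s in iota_tilde}
    # Step 2: walk t = T-2, ..., 0; only targets y showing o_{t+1} contribute
    for o in reversed(obs[1:]):
        new_beta = {s: 0.0 for s in iota_tilde}
        for (x, y), p in a_tilde.items():
            if y[1] == o:
                new_beta[x] += beta[y] * p
        beta = new_beta
    # Step 3: sum iota_tilde * beta_0 over the states showing o_0
    return sum(p * beta[s] for s, p in iota_tilde.items() if s[1] == obs[0])


# Hypothetical two-state HMM over {"x", "y"} and the OMM derived from it.
a = {("s0", "s0"): 0.7, ("s0", "s1"): 0.3, ("s1", "s0"): 0.4, ("s1", "s1"): 0.6}
b = {"s0": {"x": 0.9, "y": 0.1}, "s1": {"x": 0.2, "y": 0.8}}
iota = {"s0": 1.0, "s1": 0.0}
iota_tilde = {(mu, c): iota[mu] * b[mu][c] for mu in iota for c in "xy"}
a_tilde = {((mu, c), (nu, d)): b[nu][d] * p
           for (mu, nu), p in a.items() for c in "xy" for d in "xy"}

likelihood = omm_backward(a_tilde, iota_tilde, ["x", "y"])
```

Here $\beta_0(s_0, x) = 0.07 + 0.24 = 0.31$ and $\tilde{\iota}(s_0, x) = 0.9$, so the result is $0.9 \cdot 0.31 = 0.279$.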
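Solution of the problem #3, Baum-Welch algorithm for OMM.

Let $\forall t \in \mathbb{T}-1$, $\forall ((\mu,\nu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}$:

$$\xi_t((\mu,\nu), o_t, o_{t+1}) = P((q_t, q_{t+1}) = (\mu,\nu) \mid O_T, \lambda)$$

$$= \frac{\alpha_t(\mu, o_t) \cdot a(\mu,\nu) \cdot b_\nu(o_{t+1}) \cdot \beta_{t+1}(\nu, o_{t+1})}{\sum_{(\rho,\sigma) \in E\mathcal{A}} \alpha_t(\rho, o_t) \cdot a(\rho,\sigma) \cdot b_\sigma(o_{t+1}) \cdot \beta_{t+1}(\sigma, o_{t+1})}$$

$$= \frac{\alpha_t(\mu, o_t) \cdot \tilde{a}((\mu,\nu), o_t, o_{t+1}) \cdot \beta_{t+1}(\nu, o_{t+1})}{\sum_{(y,z) \in E\tilde{\mathcal{A}} \wedge \pi_B(y) = o_t \wedge \pi_B(z) = o_{t+1}} \alpha_t(y) \cdot \tilde{a}(y,z) \cdot \beta_{t+1}(z)}$$

Writing the edge $((\mu,\nu), o_t, o_{t+1}) \in E\tilde{\mathcal{A}}$ as the pair $(x, y)$ with $x = (\mu, o_t)$ and $y = (\nu, o_{t+1})$, and restricting the sum in the denominator to edges $(u, z)$ with
$\pi_B(u) = o_t$ and $\pi_B(z) = o_{t+1}$, it could be rewritten as:

$$\xi_t(x, y) = \frac{\alpha_t(x) \cdot \tilde{a}(x,y) \cdot \beta_{t+1}(y)}{\sum_{(u,z) \in E\tilde{\mathcal{A}} \wedge \pi_B(u) = o_t \wedge \pi_B(z) = o_{t+1}} \alpha_t(u) \cdot \tilde{a}(u,z) \cdot \beta_{t+1}(z)}$$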

The construction of the OMM proves the following statement.

Statement 1. For each HMM $\lambda$ there exists $OMM(\lambda)$ such that $P(O_T \mid \lambda) = P(O_T \mid OMM(\lambda))$.

This means that the HMM does not contain any additional features and its hidden
part could be taken into account by constructing the respective OMM, which is
completely observable.

Conclusion. We have defined a model that is equivalent to the HMM but
contains no hidden part and has a single probability distribution instead of the two
independent ones that appear in the HMM. This allows us to redefine the HMM in a more
consistent mathematical way and to improve the major HMM algorithms, though these
improvements are so trivial that they have likely been taken into account in all
implementations anyway. At the same time, the proposed model itself has become less
intuitive and has lost its direct connection to the problems from which it was derived.

Acknowledgements. Thanks to UniSay for the opportunity to complete this paper. The result was obtained
during the design and development of the multimedia processing and speech timing
system for UniSay [5].